python package
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
- Oceania > Australia (0.05)
- North America > United States > Maryland (0.05)
- Energy > Renewable > Solar (0.73)
- Government (0.70)
spd-metrics-id: A Python Package for SPD-Aware Distance Metrics in Connectome Fingerprinting and Beyond
We present spd-metrics-id, a Python package for computing distances and divergences between symmetric positive-definite (SPD) matrices. Unlike traditional toolkits that focus on specific applications, spd-metrics-id provides a unified, extensible, and reproducible framework for SPD distance computation. The package supports a wide variety of geometry-aware metrics, including Alpha-z Bures-Wasserstein, Alpha-Procrustes, affine-invariant Riemannian, log-Euclidean, and others, and is accessible both via a command-line interface and a Python API. Reproducibility is ensured through Docker images and Zenodo archiving. We illustrate usage through a connectome fingerprinting example, but the package is broadly applicable to covariance analysis, diffusion tensor imaging, and other domains requiring SPD matrix comparison. The package is openly available at https://pypi.org/project/spd-metrics-id/.
- Africa > Middle East > Egypt (0.05)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Health & Medicine > Therapeutic Area > Neurology (0.99)
- Health & Medicine > Health Care Technology (0.69)
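As a rough illustration of two of the metrics named in the spd-metrics-id abstract above, the sketch below computes the log-Euclidean and affine-invariant Riemannian distances for 2x2 SPD matrices using closed-form eigenvalues. It does not use the spd-metrics-id API: the function names are hypothetical, and the restriction to 2x2 matrices is an assumption made only to keep the example dependency-free.

```python
import math


def _check_spd2(m):
    """Return (a, b, c) for a symmetric positive-definite 2x2 matrix [[a, b], [b, c]]."""
    (a, b), (b2, c) = m
    if not math.isclose(b, b2, abs_tol=1e-12):
        raise ValueError("matrix must be symmetric")
    if a <= 0 or a * c - b * b <= 0:
        raise ValueError("matrix must be positive definite")
    return a, b, c


def logm_spd2(m):
    """Matrix logarithm of a 2x2 SPD matrix via its two eigenvalues."""
    a, b, c = _check_spd2(m)
    mean = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    hi, lo = mean + radius, mean - radius
    if hi - lo < 1e-12 * hi:
        slope = 1.0 / mean  # eigenvalues coincide: use the derivative of log
    else:
        slope = (math.log(hi) - math.log(lo)) / (hi - lo)
    base = math.log(lo)
    # log(M) = log(lo) * I + slope * (M - lo * I)
    return [[slope * (a - lo) + base, slope * b],
            [slope * b, slope * (c - lo) + base]]


def log_euclidean_distance(x, y):
    """Frobenius norm of log(x) - log(y)."""
    lx, ly = logm_spd2(x), logm_spd2(y)
    return math.sqrt(sum((lx[i][j] - ly[i][j]) ** 2
                         for i in range(2) for j in range(2)))


def airm_distance(x, y):
    """Affine-invariant Riemannian distance from the eigenvalues of x^{-1} y."""
    a1, b1, c1 = _check_spd2(x)
    a2, b2, c2 = _check_spd2(y)
    det_x = a1 * c1 - b1 * b1
    det_y = a2 * c2 - b2 * b2
    trace = (c1 * a2 - 2.0 * b1 * b2 + a1 * c2) / det_x  # trace of x^{-1} y
    det = det_y / det_x                                   # determinant of x^{-1} y
    half = trace / 2.0
    disc = max(half * half - det, 0.0)                    # clamp rounding noise
    lam_hi = half + math.sqrt(disc)
    lam_lo = det / lam_hi
    return math.hypot(math.log(lam_hi), math.log(lam_lo))
```

For larger matrices, a numerical eigendecomposition would replace the closed-form step; the distance definitions stay the same.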
An Empirical Study of Vulnerabilities in Python Packages and Their Detection
Quan, Haowei, Wang, Junjie, Li, Xinzhe, Zhuo, Terry Yue, Chen, Xiao, Du, Xiaoning
In the rapidly evolving software development landscape, Python stands out for its simplicity, versatility, and extensive ecosystem. The security of Python packages, which serve as units of organization, reuse, and distribution, has become a pressing concern, as highlighted by the considerable number of vulnerability reports. As a scripting language, Python often cooperates with other languages for performance or interoperability. This adds complexity to the vulnerabilities inherent to Python packages, and the effectiveness of current vulnerability detection tools remains underexplored. This paper addresses these gaps by introducing PyVul, the first comprehensive benchmark suite of Python-package vulnerabilities. PyVul includes 1,157 publicly reported, developer-verified vulnerabilities, each linked to its affected packages. To accommodate diverse detection techniques, it provides annotations at both commit and function levels. An LLM-assisted data cleansing method is incorporated to improve label accuracy, achieving 100% commit-level and 94% function-level accuracy, establishing PyVul as the most precise large-scale Python vulnerability benchmark. We further carry out a distribution analysis of PyVul, which demonstrates that vulnerabilities in Python packages involve multiple programming languages and exhibit a wide variety of types. Moreover, our analysis reveals that multi-lingual Python packages are potentially more susceptible to vulnerabilities. Evaluation of state-of-the-art detectors using this benchmark reveals a significant discrepancy between the capabilities of existing tools and the demands of effectively identifying real-world security issues in Python packages. Additionally, we conduct an empirical review of the top-ranked CWEs observed in Python packages to diagnose the fine-grained limitations of current detection tools and highlight the necessity for future advancements in the field.
- North America > United States > District of Columbia > Washington (0.05)
- Asia > China > Tianjin Province > Tianjin (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
PySHRED: A Python package for SHallow REcurrent Decoding for sparse sensing, model reduction and scientific discovery
Ye, David, Williams, Jan, Gao, Mars, Riva, Stefano, Tomasetto, Matteo, Zoro, David, Kutz, J. Nathan
PySHRED is a Python package that implements the SHallow REcurrent Decoder (SHRED) architecture (Figure 1) and provides a high-level interface for sensing, model reduction, and physics discovery tasks. Originally proposed as a sensing strategy that is agnostic to sensor placement [1], SHRED provides a lightweight, data-driven framework for reconstructing and forecasting high-dimensional spatiotemporal states from sparse sensor measurements. SHRED achieves this by (i) encoding time-lagged sensor sequences into a low-dimensional latent space using a sequence model, and (ii) decoding these latent representations back into the full spatial field via a decoder model. Since its introduction as a sparse sensing algorithm, several specialized variants have been developed to extend SHRED's capabilities: SHRED-ROM for parametric reduced-order modeling; SINDy-SHRED for discovering sparse latent dynamics and stable long-horizon forecasting; and Multi-field SHRED for modeling dynamically coupled fields. PySHRED unifies these variants into a single open-source, extensible, and thoroughly documented Python package, which is also capable of training on compressed representations of the data, allowing for efficient laptop-level training of models. It is accompanied by a rich example gallery of Jupyter Notebook and Google Colab tutorials.
- North America > United States > Washington > King County > Seattle (0.16)
- Europe > Italy > Lombardy > Milan (0.04)
- Pacific Ocean (0.04)
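The PySHRED abstract above describes encoding time-lagged sensor sequences before decoding them into the full field. The sketch below shows only that data-preparation step, building one window of consecutive sensor readings per time step. It is not the PySHRED API: the function name is hypothetical, and skipping the first time steps that lack a full history (rather than padding them) is an assumption.

```python
def make_lagged_sequences(measurements, lags):
    """Build time-lagged windows from sparse sensor measurements.

    measurements: list of per-time-step readings, each a list of sensor values.
    lags: number of consecutive time steps in each window.
    Returns a list of (end_index, window) pairs, where window holds the
    `lags` readings ending at time step end_index.
    """
    if lags < 1:
        raise ValueError("lags must be at least 1")
    sequences = []
    # Start at the first time step that has a full history of `lags` readings.
    for end in range(lags - 1, len(measurements)):
        window = measurements[end - lags + 1:end + 1]
        sequences.append((end, window))
    return sequences
```

Each window would then be passed to a sequence model, whose latent output a decoder maps to the full spatial state at time step `end_index`.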
ALT: A Python Package for Lightweight Feature Representation in Time Series Classification
Halmos, Balázs P., Hajós, Balázs, Molnár, Vince Á., Kurbucz, Marcell T., Jakovác, Antal
We introduce ALT, an open-source Python package created for efficient and accurate time series classification (TSC). The package implements the adaptive law-based transformation (ALT) algorithm, which transforms raw time series data into a linearly separable feature space using variable-length shifted time windows. This adaptive approach enhances its predecessor, the linear law-based transformation (LLT), by effectively capturing patterns of varying temporal scales. The software is implemented for scalability, interpretability, and ease of use, achieving state-of-the-art performance with minimal computational overhead. Extensive benchmarking on real-world datasets demonstrates the utility of ALT for diverse TSC tasks in physics and related domains.
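To make the phrase "variable-length shifted time windows" concrete, the sketch below extracts windows of several lengths from one series, moving the start index by a fixed shift. It covers only this windowing idea, not the law-based transformation itself, and it is not the ALT package API: the function name and parameters are hypothetical.

```python
def shifted_windows(series, lengths, shift):
    """Extract windows of each requested length, starting every `shift` samples.

    Returns a dict mapping window length to the list of windows of that length.
    """
    if shift < 1:
        raise ValueError("shift must be at least 1")
    windows = {}
    for length in lengths:
        if length < 1:
            raise ValueError("window lengths must be at least 1")
        # Only keep windows that fit entirely inside the series.
        last_start = len(series) - length
        windows[length] = [series[start:start + length]
                           for start in range(0, last_start + 1, shift)]
    return windows
```

Using several window lengths on the same series is what lets a method see both short and long temporal patterns.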
Efficient Annotator Reliability Assessment with EffiARA
Cook, Owen, Vasilakes, Jake, Roberts, Ian, Song, Xingyi
Data annotation is an essential component of the machine learning pipeline; it is also a costly and time-consuming process. With the introduction of transformer-based models, annotation at the document level is increasingly popular; however, there is no standard framework for structuring such tasks. The EffiARA annotation framework is, to our knowledge, the first project to support the whole annotation pipeline, from understanding the resources required for an annotation task to compiling the annotated dataset and gaining insights into the reliability of individual annotators as well as the dataset as a whole. The framework's efficacy is supported by two previous studies: one improving classification performance through annotator-reliability-based soft label aggregation and sample weighting, and the other increasing the overall agreement among annotators by identifying and replacing an unreliable annotator. This work introduces the EffiARA Python package and its accompanying webtool, which provides an accessible graphical user interface for the system. We open-source the EffiARA Python package at https://github.com/MiniEggz/EffiARA and the webtool is publicly accessible at https://effiara.gate.ac.uk.
- North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Europe > Italy (0.04)
- Research Report (0.50)
- Workflow (0.47)
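The EffiARA abstract above mentions annotator-reliability-based soft label aggregation. One simple way to read that idea is a reliability-weighted average of each annotator's label distribution for a sample, sketched below. This is a generic illustration under that assumption, not the EffiARA API or its exact aggregation rule; the function and argument names are hypothetical.

```python
def aggregate_soft_labels(annotations, reliability):
    """Combine per-annotator label distributions into one soft label.

    annotations: dict mapping annotator name to a list of class probabilities
                 for a single sample.
    reliability: dict mapping annotator name to a non-negative weight.
    Returns the reliability-weighted average distribution.
    """
    if not annotations:
        raise ValueError("at least one annotation is required")
    total = sum(reliability[name] for name in annotations)
    if total <= 0:
        raise ValueError("reliability weights must sum to a positive value")
    n_classes = len(next(iter(annotations.values())))
    soft_label = [0.0] * n_classes
    for name, distribution in annotations.items():
        if len(distribution) != n_classes:
            raise ValueError("all annotations must cover the same classes")
        weight = reliability[name] / total  # normalise so weights sum to 1
        for k, prob in enumerate(distribution):
            soft_label[k] += weight * prob
    return soft_label
```

With this weighting, a disagreement between a reliable and an unreliable annotator yields a soft label that leans toward the reliable one instead of a hard majority vote.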
nabqr: Python package for improving probabilistic forecasts
Jørgensen, Bastian Schmidt, Møller, Jan Kloppenborg, Nystrup, Peter, Madsen, Henrik
We introduce the open-source Python package NABQR (Neural Adaptive Basis for (time-adaptive) Quantile Regression), which provides reliable probabilistic forecasts. NABQR corrects ensembles (scenarios) with LSTM networks and then applies time-adaptive quantile regression to the corrected ensembles to obtain improved and more reliable forecasts. With the package, accuracy improvements of up to 40% in mean absolute terms can be achieved in day-ahead forecasting of onshore and offshore wind power production in Denmark. Abbreviations: Table 2.
1. Motivation and significance: Quantifying predictive uncertainty is a key challenge in many scientific fields that depend on model-based forecasts [1].
Code metadata: C1, current code version: 0.1; C2, permanent link to code/repository: https://github.com/bast0320/
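Quantile regression, which the NABQR abstract above applies to the corrected ensembles, is conventionally fitted and scored with the pinball (quantile) loss. The sketch below computes that loss for one quantile level. It is a generic illustration, not the NABQR API; the function name is hypothetical.

```python
def pinball_loss(y_true, y_pred, quantile):
    """Mean pinball loss of predictions for one quantile level in (0, 1)."""
    if not 0.0 < quantile < 1.0:
        raise ValueError("quantile must lie strictly between 0 and 1")
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for observed, predicted in zip(y_true, y_pred):
        error = observed - predicted
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += quantile * error if error >= 0 else (quantile - 1.0) * error
    return total / len(y_true)
```

At the 0.5 quantile the loss equals half the mean absolute error, which links quantile forecasts to the "mean absolute terms" used in the abstract.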
LLM-IE: A Python Package for Generative Information Extraction with Large Language Models
Objectives: Despite the recent adoption of large language models (LLMs) for biomedical information extraction, challenges in prompt engineering and algorithms persist, with no dedicated software available. To address this, we developed LLM-IE: a Python package for building complete information extraction pipelines. Our key innovation is an interactive LLM agent to support schema definition and prompt design. Materials and Methods: LLM-IE supports named entity recognition, entity attribute extraction, and relation extraction tasks. We benchmarked on the i2b2 datasets and conducted a system evaluation. Results: The sentence-based prompting algorithm achieved the best performance but required a longer inference time. System evaluation provided intuitive visualization. Discussion: LLM-IE was designed from practical NLP experience in healthcare and has been adopted in internal projects. It should hold great value to the biomedical NLP community. Conclusion: We developed a Python package, LLM-IE, that provides building blocks for robust information extraction pipeline construction.
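The trade-off reported for sentence-based prompting (best performance, longer inference) follows from issuing one model call per sentence. The sketch below illustrates that pattern in general terms. It is not the LLM-IE API: the function names, the prompt layout, and the `llm` callable are all hypothetical, and the regex sentence splitter is a deliberately naive assumption.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def sentence_based_extract(text, schema_prompt, llm):
    """Call `llm` once per sentence and collect (sentence, output) pairs.

    llm: any callable that takes a prompt string and returns a result;
         a stand-in for a real model client.
    """
    results = []
    for sentence in split_sentences(text):
        prompt = f"{schema_prompt}\n\nSentence: {sentence}"
        results.append((sentence, llm(prompt)))
    return results
```

The number of model calls grows linearly with the number of sentences, which is the source of the longer inference time compared with prompting on the whole document at once.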
PyGen: A Collaborative Human-AI Approach to Python Package Creation
Barua, Saikat, Rahman, Mostafizur, Sadek, Md Jafor, Islam, Rafiul, Khaled, Shehnaz, Hossain, Md. Shohrab
The principles of automation and innovation serve as foundational elements for advancement in contemporary science and technology. Here, we introduce Pygen, an automation platform designed to empower researchers, technologists, and hobbyists to bring abstract ideas to life as core, usable software tools written in Python. Pygen leverages the immense power of autoregressive large language models to augment human creativity during the ideation, iteration, and innovation process. By combining state-of-the-art language models with open-source code generation technologies, Pygen has significantly reduced the manual overhead of tool development. From a user prompt, Pygen automatically generates Python packages for a complete workflow from concept to package generation and documentation. The findings of our work show that Pygen considerably enhances the researcher's productivity by enabling the creation of resilient, modular, and well-documented packages for various specialized purposes. We employ a prompt enhancement approach to distill the user's package description into an increasingly specific and actionable form. Although this is inherently an open-ended task, we have evaluated the generated packages and the documentation using human evaluation, LLM-based evaluation, and CodeBLEU, with detailed results in the results section. Furthermore, we documented our results, analyzed the limitations, and suggested strategies to alleviate them. Pygen is our vision of ethical automation, a framework that promotes inclusivity, accessibility, and collaborative development. This project marks the beginning of a large-scale effort towards creating tools where intelligent agents collaborate with humans to improve scientific and technological development substantially. Our code and generated examples are open-sourced at https://github.com/GitsSaikat/Pygen.
- Workflow (1.00)
- Research Report > New Finding (0.66)
- Health & Medicine (1.00)
- Information Technology > Security & Privacy (0.67)